Abstract. We show how to build an alphabetic minimax tree for a sequence $W = w_1, \ldots, w_n$ of
real weights in $O(nd \log\log n)$ time, where $d$ is the number of distinct integers $\lceil w_i \rceil$. We apply this
algorithm to building an alphabetic prefix code given a sample.

1. Introduction



For the alphabetic minimax tree problem, we are given a sequence $W = w_1, \ldots, w_n$ of weights and an
integer $t \ge 2$ and asked to find an ordered $t$-ary tree on $n$ leaves such that, if the depths of the leaves
from left to right are $\ell_1, \ldots, \ell_n$, then $\max_{1 \le i \le n} \{w_i + \ell_i\}$ is minimized. Such a tree is called a $t$-ary
alphabetic minimax tree for $W$ and the minimum maximum sum, $a(W)$, is called the $t$-ary alphabetic
minimax cost of $W$.
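
To make the definition concrete, here is a minimal brute-force sketch for the binary case ($t = 2$). It uses the property, noted in Section 4, that the cost equals the weight of the root when each internal node's weight is the maximum of its children's weights plus 1. The name `minimax_cost` and the cubic-time interval recursion are illustrative assumptions; this is not one of the algorithms cited in this paper.

```python
from functools import lru_cache

def minimax_cost(weights):
    """Binary alphabetic minimax cost a(W) by exhaustive interval recursion."""
    w = tuple(weights)

    @lru_cache(maxsize=None)
    def cost(i, j):
        # Cost of the best subtree whose leaves are w[i..j], in order.
        if i == j:
            return w[i]
        # Try every split point; the new root sits one level above both subtrees.
        return 1 + min(max(cost(i, k), cost(k + 1, j)) for k in range(i, j))

    return cost(0, len(w) - 1)

print(minimax_cost([3, 0, 0]))
```

For the weights 3, 0, 0 the heavy leaf is best placed directly under the root, next to a subtree holding the two light leaves.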

Hu, Kleitman and Tamaki gave an $O(n \log n)$-time algorithm for this problem when $t$ is 2 or
3. Under the assumption that the tree must be strictly $t$-ary, Kirkpatrick and Klawe [8] gave $O(n)$-time and
$O(n \log n)$-time algorithms for integer and real weights, respectively, which they applied to bounding
circuit fan-out. Coppersmith, Klawe and Pippenger [3] modified Kirkpatrick and Klawe's algorithms
to work without the assumption, and again applied them to bounding circuit fan-out. Kirkpatrick and
Przytycka [9] gave an $O(\log n)$-time, $O(n / \log n)$-processor algorithm for integer weights in the CREW
PRAM model. Finally, Evans and Kirkpatrick [5] gave an $O(n)$-time algorithm for the problem with
integer weights in which we want to find a binary tree that minimizes the maximum over $i$ of the sum
of the $i$th weight and the $i$th node's (rather than leaf's) depth, and applied it to restructuring ordered
binary trees. In this paper, we give an $O(nd \log\log n)$-time algorithm for the original problem with real
weights, where $d$ is the number of distinct integers $\lceil w_i \rceil$. Our algorithm can be adapted to work for any
$t$ but, to simplify the presentation, we assume $t = 2$ and write $\log$ to mean $\log_2$.

2. Motivation 

Our interest in alphabetic minimax trees stems from a problem concerning alphabetic prefix codes, i.e.,
prefix codes in which the lexicographic order of the codewords is the same as that of the characters.
Suppose we want to build an alphabetic prefix code with which to compress a file (or, equivalently, a
leaf-oriented binary search tree with which to sort it), but we are given only a sample of its characters.
Let $P = p_1, \ldots, p_n$ be the normalized distribution of characters in the file, let $Q = q_1, \ldots, q_n$ be the
normalized distribution of characters in the sample and suppose our codewords are $C = c_1, \ldots, c_n$.
An ideal code for $Q$ assigns the $i$th character a codeword of length $\log(1/q_i)$ (which may not be an
integer), and the average codeword's length using such a code is $H(P) + D(P\|Q)$, where $H(P) =
\sum_i p_i \log(1/p_i)$ is the entropy of $P$ and $D(P\|Q) = \sum_i p_i \log(p_i/q_i)$ is the relative entropy between $P$
and $Q$.
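
As a quick numerical check of this identity, the sketch below computes the entropy, the relative entropy and the average length of the ideal code for $Q$. The function names and the sample distributions are assumptions chosen for illustration.

```python
from math import log2

def entropy(p):
    """H(P) = sum_i p_i log(1/p_i), skipping zero-probability characters."""
    return sum(pi * log2(1 / pi) for pi in p if pi > 0)

def relative_entropy(p, q):
    """D(P||Q) = sum_i p_i log(p_i/q_i); requires q_i > 0 whenever p_i > 0."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def ideal_average_length(p, q):
    """Average length when the ith codeword has length log(1/q_i)."""
    return sum(pi * log2(1 / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]   # hypothetical file distribution
q = [0.25, 0.25, 0.5]   # hypothetical sample distribution
print(entropy(p) + relative_entropy(p, q), ideal_average_length(p, q))
```

The two printed values agree, as the identity predicts.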

Consider the best worst-case bound we can achieve on how much the average codeword's length
exceeds $H(P) + D(P\|Q)$. As long as $q_i > 0$ whenever $p_i > 0$, the average codeword's length is

$$H(P) + D(P\|Q) + \sum_i p_i (\log q_i + |c_i|)$$

(if $q_i = 0$ but $p_i > 0$ for some $i$, then our formula is undefined). Notice each $|c_i|$ is the length of the $i$th
branch in the trie for $C$. Therefore, the best bound we can achieve is

$$\min_C \max_i \{\log q_i + |c_i|\}$$

and we achieve it when the trie for $C$ is an alphabetic minimax tree for $\log q_1, \ldots, \log q_n$.

In several reasonable special cases, we can build the alphabetic minimax tree for $\log q_1, \ldots, \log q_n$
in $o(n \log n)$ time. For example, if each pair $q_i$ and $q_j$ differ by at most a multiplicative constant
(a case Klawe and Mumey [10] considered when building optimal alphabetic prefix codes), then each
pair $\log q_i$ and $\log q_j$ differ by at most an additive constant, so the number of distinct integers $\lceil \log q_i \rceil$ is
constant and our algorithm runs in $O(n \log\log n)$ time.

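The decomposition above can be checked numerically. The sketch below computes the excess $\sum_i p_i (\log q_i + |c_i|)$ for a given $P$ and the worst-case bound $\max_i \{\log q_i + |c_i|\}$ for a fixed list of codeword lengths. The function names, the sample distribution and the codeword lengths are assumptions made for illustration.

```python
from math import log2

def average_length(p, lengths):
    """Average codeword length sum_i p_i |c_i|."""
    return sum(pi * li for pi, li in zip(p, lengths))

def excess(p, q, lengths):
    """sum_i p_i (log q_i + |c_i|): how far the code is above H(P) + D(P||Q)."""
    return sum(pi * (log2(qi) + li) for pi, qi, li in zip(p, q, lengths) if pi > 0)

def worst_case_excess(q, lengths):
    """max_i { log q_i + |c_i| }: the largest excess over all distributions P."""
    return max(log2(qi) + li for qi, li in zip(q, lengths))

q = [0.5, 0.25, 0.25]    # hypothetical sample distribution
lengths = [2, 2, 1]      # lengths of the codewords 00, 01, 1
print(worst_case_excess(q, lengths))
print(excess([1.0, 0.0, 0.0], q, lengths))   # a P concentrated on the worst character
```

A distribution $P$ concentrated on the character that maximizes $\log q_i + |c_i|$ attains the bound, which is why the maximum over $i$ is the right quantity to minimize.
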
3. Algorithm

Let $B = b_1, \ldots, b_n$ be the values $w_1 - \lfloor w_1 \rfloor, \ldots, w_n - \lfloor w_n \rfloor$ sorted into nondecreasing order. Kirkpatrick
and Klawe showed that, if $i$ is the smallest index such that

$$a(\lceil w_1 - b_i \rceil, \ldots, \lceil w_n - b_i \rceil) = a(\lceil w_1 - b_n \rceil, \ldots, \lceil w_n - b_n \rceil),$$

then $a(W) = a(\lceil w_1 - b_i \rceil, \ldots, \lceil w_n - b_i \rceil) + b_i$ and any alphabetic minimax tree for $\lceil w_1 - b_i \rceil, \ldots, \lceil w_n -
b_i \rceil$ is an alphabetic minimax tree for $W$. Their $O(n \log n)$-time algorithm for real weights is a simple
combination of this fact, binary search and their $O(n)$-time algorithm for integer weights: they compute
and sort $w_1 - \lfloor w_1 \rfloor, \ldots, w_n - \lfloor w_n \rfloor$ to obtain $B$, compute an alphabetic minimax tree for the sequence
$\lceil w_1 - b_n \rceil, \ldots, \lceil w_n - b_n \rceil$ of integer weights, and use binary search to find $b_i$; for each step of the
binary search, if the candidate value to be tested is $b_j$, then they build an alphabetic minimax tree for
the sequence $\lceil w_1 - b_j \rceil, \ldots, \lceil w_n - b_j \rceil$ of integer weights and compare $a(\lceil w_1 - b_j \rceil, \ldots, \lceil w_n - b_j \rceil)$
to $a(\lceil w_1 - b_n \rceil, \ldots, \lceil w_n - b_n \rceil)$.
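
The reduction from real to integer weights can be checked on small inputs. The sketch below is a brute-force illustration under stated assumptions: it uses exact rationals (`Fraction`) to avoid rounding trouble, a cubic-time recursion in place of Kirkpatrick and Klawe's $O(n)$-time algorithm, and a linear scan over $B$ in place of binary search. The function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import ceil, floor

def minimax_cost(weights):
    """Brute-force binary alphabetic minimax cost (stand-in for the O(n) algorithm)."""
    w = tuple(weights)

    @lru_cache(maxsize=None)
    def cost(i, j):
        if i == j:
            return w[i]
        return 1 + min(max(cost(i, k), cost(k + 1, j)) for k in range(i, j))

    return cost(0, len(w) - 1)

def cost_via_integer_weights(weights):
    """Compute a(W) as a(ceil(w_1 - b_i), ..., ceil(w_n - b_i)) + b_i."""
    w = [Fraction(v) for v in weights]
    b = sorted(v - floor(v) for v in w)          # B = b_1 <= ... <= b_n

    def shifted(shift):
        return [ceil(v - shift) for v in w]      # integer weights

    target = minimax_cost(shifted(b[-1]))
    # Smallest b_i whose shifted integer sequence has the same cost as for b_n.
    b_i = next(s for s in b if minimax_cost(shifted(s)) == target)
    return target + b_i

w = [Fraction(9, 4), Fraction(1, 2), Fraction(1, 2)]
print(cost_via_integer_weights(w), minimax_cost(w))
```

On this input the smallest qualifying value is $b_i = 1/4$ rather than $b_n = 1/2$, and both routes give the same cost.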

Our idea is to avoid sorting $w_1 - \lfloor w_1 \rfloor, \ldots, w_n - \lfloor w_n \rfloor$ and then building an alphabetic minimax tree
from scratch for each step of the binary search. To avoid sorting, we use a technique similar to the one
Klawe and Mumey described for generalized selection; to avoid building the trees from scratch, we use
a data structure based on Kirkpatrick and Przytycka's level tree data structure for $W$. Our data structure,
which we describe in Section 4, stores $W$ and $X = x_1, \ldots, x_n = 0, \ldots, 0$ and performs any sequence
of $O(n)$ of the following operations in $O(nd \log\log n)$ time:

set(i): set $x_i$ to 1;

undo: undo the last set operation;

cost: return $a(\lceil w_1 \rceil - x_1, \ldots, \lceil w_n \rceil - x_n)$.
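
To pin down the semantics of these three operations, here is a naive reference sketch. It is only a stand-in for the interface: it recomputes the cost from scratch with a cubic-time recursion, so it comes nowhere near the stated time bound, and it is not the level-tree structure of Section 4. The class name is hypothetical and indices are 0-based.

```python
from functools import lru_cache
from math import ceil

class NaiveShiftedCost:
    """Reference stand-in for set/undo/cost; recomputes everything on each cost call."""

    def __init__(self, weights):
        self.ceil_w = [ceil(w) for w in weights]
        self.x = [0] * len(weights)      # the bits X, all initially 0
        self.history = []                # (index, previous bit) per set call

    def set(self, i):
        self.history.append((i, self.x[i]))
        self.x[i] = 1

    def undo(self):
        # Reverse the most recent set operation.
        i, previous = self.history.pop()
        self.x[i] = previous

    def cost(self):
        y = tuple(c - b for c, b in zip(self.ceil_w, self.x))

        @lru_cache(maxsize=None)
        def best(lo, hi):
            if lo == hi:
                return y[lo]
            return 1 + min(max(best(lo, k), best(k + 1, hi)) for k in range(lo, hi))

        return best(0, len(y) - 1)

ds = NaiveShiftedCost([1.5, 0.25, 0.75])
for i in range(3):
    ds.set(i)
print(ds.cost())
ds.undo()
print(ds.cost())
```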

We first find $b_n = \max_i \{w_i - \lfloor w_i \rfloor\}$ and then, using Kirkpatrick and Klawe's $O(n)$-time algorithm,
$a(\lceil w_1 - b_n \rceil, \ldots, \lceil w_n - b_n \rceil)$. We build the multiset $S_0 = \{(w_i - \lfloor w_i \rfloor, i)\}$ and use binary search to
find the smallest value $w_i - \lfloor w_i \rfloor$ such that

$$a(\lceil w_1 - (w_i - \lfloor w_i \rfloor) \rceil, \ldots, \lceil w_n - (w_i - \lfloor w_i \rfloor) \rceil) = a(\lceil w_1 - b_n \rceil, \ldots, \lceil w_n - b_n \rceil).$$

Once we have $w_i - \lfloor w_i \rfloor$, we use Kirkpatrick and Klawe's $O(n)$-time algorithm again to build an alphabetic
minimax tree for the sequence $\lceil w_1 - (w_i - \lfloor w_i \rfloor) \rceil, \ldots, \lceil w_n - (w_i - \lfloor w_i \rfloor) \rceil$ of integer weights.

For the $k$th step of the binary search, we use Blum et al.'s algorithm [2] to find the median $m_k$ of the
first components in $S_k$; we divide $S_k$ into

$$S'_k = \{(w_i - \lfloor w_i \rfloor, i) \in S_k : w_i - \lfloor w_i \rfloor < m_k\},$$
$$S''_k = \{(w_i - \lfloor w_i \rfloor, i) \in S_k : w_i - \lfloor w_i \rfloor = m_k\},$$
$$S'''_k = \{(w_i - \lfloor w_i \rfloor, i) \in S_k : w_i - \lfloor w_i \rfloor > m_k\};$$

for each second component $j$ in $S'_k$ or $S''_k$ with $w_j$ not an integer, we set $x_j$ to 1; we compare $a(\lceil w_1 \rceil - x_1,
\ldots, \lceil w_n \rceil - x_n)$ to $a(\lceil w_1 - b_n \rceil, \ldots, \lceil w_n - b_n \rceil)$; if it is equal, then $m_k$ is still a candidate, so we undo
all the set operations we performed in this step and recurse on $S'_k$; if it is greater, then $m_k$ is too small, so
we leave all the set operations and recurse on $S'''_k$. The last candidate considered during the search is the
value $w_i - \lfloor w_i \rfloor$ we want. For the $k$th step of the search, we spend $O(n/2^k)$ time finding the median
and dividing $S_k$ into $S'_k$, $S''_k$ and $S'''_k$, and perform $O(n/2^k)$ operations on the data structure. Summing
over the steps, we use $O(n)$ time to find all the medians and divide all the sets and $O(nd \log\log n)$ time
to perform all the operations on the data structure.
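
The search just described can be sketched as follows. This is an illustration under several assumptions: weights are exact rationals, the median is found by sorting rather than by Blum et al.'s linear-time selection, and the cost is recomputed by a cubic-time recursion instead of by the data structure of Section 4. All function and variable names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import ceil, floor

def minimax_cost(weights):
    """Brute-force binary alphabetic minimax cost (stand-in for the cost operation)."""
    w = tuple(weights)

    @lru_cache(maxsize=None)
    def cost(i, j):
        if i == j:
            return w[i]
        return 1 + min(max(cost(i, k), cost(k + 1, j)) for k in range(i, j))

    return cost(0, len(w) - 1)

def search_fractional_shift(weights):
    """Return (smallest qualifying w_i - floor(w_i), a(W))."""
    w = [Fraction(v) for v in weights]
    ceil_w = [ceil(v) for v in w]
    frac = [v - floor(v) for v in w]
    x = [0] * len(w)                       # the bits X, all initially 0
    b_n = max(frac)
    target = minimax_cost([ceil(v - b_n) for v in w])

    current = list(range(len(w)))          # second components of S_k
    candidate = None
    while current:
        # Stand-in for linear-time selection: sort and take the lower median.
        m = sorted(frac[j] for j in current)[(len(current) - 1) // 2]
        below = [j for j in current if frac[j] < m]
        equal = [j for j in current if frac[j] == m]
        above = [j for j in current if frac[j] > m]
        touched = [j for j in below + equal if frac[j] != 0]  # w_j not an integer
        for j in touched:
            x[j] = 1                       # set(j)
        if minimax_cost([c - b for c, b in zip(ceil_w, x)]) == target:
            candidate = m                  # m is still a candidate
            for j in touched:
                x[j] = 0                   # undo this step's set operations
            current = below
        else:
            current = above                # m is too small; keep the set operations
    return candidate, target + candidate

print(search_fractional_shift([Fraction(9, 4), Fraction(1, 2), Fraction(1, 2)]))
```

Each iteration discards at least the elements equal to the median, so the loop terminates, and the maximum fractional part $b_n$ always qualifies, so a candidate is always found.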

Lemma 3.1. Given a data structure that performs any sequence of $O(n)$ set, undo and cost operations
in $O(nd \log\log n)$ time, we can build an alphabetic minimax tree for $W$ in $O(nd \log\log n)$ time.

4. Data structure 

If we define the weight of the $i$th leaf of an alphabetic minimax tree for $W$ to be $w_i$, and the weight
of each internal node to be the maximum of its children's weights plus 1, then the weight of the root is
$a(W)$. We would like to use this property to recompute $a(\lceil w_1 \rceil - x_1, \ldots, \lceil w_n \rceil - x_n)$ efficiently after
updating $X$, but even small changes can greatly affect the shape of the alphabetic minimax tree: e.g.,
suppose $n = 2^k + 1$, each $w_i = k - 1/2$ and each $x_i = 0$; if we set $x_1$ and $x_2$ to 1 then, in the unique
alphabetic minimax tree for

$$\lceil w_1 \rceil - x_1, \ldots, \lceil w_n \rceil - x_n = k - 1, k - 1, k, \ldots, k,$$

every even-numbered leaf except the second is a left-child; but if we instead set $x_{n-1}$ and $x_n$ to 1 then,
in the unique alphabetic minimax tree for

$$\lceil w_1 \rceil - x_1, \ldots, \lceil w_n \rceil - x_n = k, \ldots, k, k - 1, k - 1,$$

every even-numbered leaf except the $(n - 1)$st is a right-child.
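
This volatility is easy to observe for $k = 3$ ($n = 9$). The sketch below builds an optimal tree by exhaustive recursion and reports, for each even-numbered leaf, whether it is a left or right child. The recursion, the tie-breaking rule and all names are assumptions made for illustration; because the optimal tree is unique in both cases, any correct method would report the same shape.

```python
from functools import lru_cache

def optimal_tree(weights):
    """Return (cost, tree); a tree is a 0-based leaf index or a (left, right) pair."""
    w = tuple(weights)

    @lru_cache(maxsize=None)
    def best(i, j):
        if i == j:
            return w[i], i
        candidates = []
        for k in range(i, j):
            (left_cost, left), (right_cost, right) = best(i, k), best(k + 1, j)
            candidates.append((1 + max(left_cost, right_cost), k, (left, right)))
        # Lowest cost first; ties broken by the leftmost split point.
        cost, _, tree = min(candidates, key=lambda c: c[:2])
        return cost, tree

    return best(0, len(w) - 1)

def leaf_sides(tree, side="root", sides=None):
    """Map each 1-based leaf number to 'L' or 'R' (which child of its parent it is)."""
    sides = {} if sides is None else sides
    if isinstance(tree, int):
        sides[tree + 1] = side
    else:
        leaf_sides(tree[0], "L", sides)
        leaf_sides(tree[1], "R", sides)
    return sides

k = 3
n = 2 ** k + 1
front = [k - 1, k - 1] + [k] * (n - 2)   # x_1 and x_2 set to 1
back = [k] * (n - 2) + [k - 1, k - 1]    # x_{n-1} and x_n set to 1
for y in (front, back):
    cost, tree = optimal_tree(y)
    sides = leaf_sides(tree)
    print(cost, [sides[i] for i in range(2, n, 2)])
```

Both sequences have the same cost, yet leaves 4 and 6 switch from left children to right children.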

Fortunately for us, Kirkpatrick and Przytycka defined a data structure, called a level tree, that represents
an alphabetic minimax tree but whose shape is less volatile. Let

$$Y = y_1, \ldots, y_n = \lceil w_1 \rceil - x_1, \ldots, \lceil w_n \rceil - x_n,$$

and consider their definition of the level tree for $Y$ (we have changed their notation slightly to match our
own):

"We start our description of the level tree with the following geometric construction (see
Figure 1): Represent the sequence of weights $Y$ by a polygonal line; for every $i = 1, \ldots, n$
draw on the plane the point $(i, y_i)$, and for every $i = 1, \ldots, n - 1$ connect the points $(i, y_i)$
and $(i + 1, y_{i+1})$; for every $i$ such that $y_i > y_{i+1}$ (resp., $y_i > y_{i-1}$) draw a horizontal line
going from $(i, y_i)$ to its right (resp., left) until it hits the polygonal line. The intervals defined
in such a way are called the level intervals. We also consider the interval $[(0, \infty), (n+1, \infty)]$
and the degenerate intervals $[(i, y_i), (i, y_i)]$ as level intervals. Let $e$ be a level interval. Note
that at least one of $e$'s endpoints is equal to $(i, y_i)$ for some index $i$. ... We define the level
of a level interval to be equal to [the second component of points belonging to that interval].

Note that an alphabetic minimax tree can be embedded in the plane in such a way that
the root of the tree belongs to the level interval $[(0, \infty), (n + 1, \infty)]$ and that internal nodes
whose weights are equal to the weight of one of the leaves belong to the horizontal line
through this leaf. Furthermore, if there is a tree edge cutting a level interval then adding
a node subdividing this edge to the alphabetic minimax tree does not increase the weight
of the root. By this observation we can consider alphabetic minimax trees which can be
embedded in the plane in such a way that all edges intersect level intervals only at endpoints
(see Figure 2).

The level tree for $Y$ is the ordered tree whose nodes are in one-to-one correspondence
with the level intervals defined above. The parent of a node $v$ is the internal node which
corresponds to the closest level interval which lies above the level interval corresponding
to $v$. The left-to-right order of the children of an internal node corresponds to the left-to-right
order of the corresponding level intervals on the plane (see Figure 3). For every node
$u$ of a level tree we define $\mathrm{load}(u)$ to be equal to the number of nodes of the constructed
alphabetic minimax tree which belong to the level interval corresponding to $u$ (assuming the
above embedding).

If $u$ is a leaf then $\mathrm{load}(u) = 1$. Assume that $u$ is an internal node and let $u_1, \ldots, u_k$
be the children of $u$. Let $\Delta_u$ denote the minimum of the value $\lceil \log n \rceil$ and the difference
between the level of the level interval corresponding to node $u$ and the level of the intervals
corresponding to its children. It is easy to confirm that

$$\mathrm{load}(u) = \left\lceil \frac{\mathrm{load}(u_1) + \cdots + \mathrm{load}(u_k)}{2^{\Delta_u}} \right\rceil .$$"

Notice that, if $u$ is the root of the level tree and $u_1, \ldots, u_k$ are its children, then Kirkpatrick and Przytycka
embed $\mathrm{load}(u_1) + \cdots + \mathrm{load}(u_k)$ nodes of the alphabetic minimax tree into the intervals corresponding
to $u_1, \ldots, u_k$. It follows that $a(Y)$ is the level of the intervals corresponding to $u_1, \ldots, u_k$ plus
$\lceil \log(\mathrm{load}(u_1) + \cdots + \mathrm{load}(u_k)) \rceil$.
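
The load recurrence and the resulting cost formula amount to simple integer arithmetic, sketched below. The level tree itself is not constructed here: the small example for $Y = 1, 1, 2$ is built by hand from our reading of the quoted definition (an internal node at level 2 whose children are the first two leaves, with the third leaf as its sibling under the root), which is an assumption. The function names are hypothetical.

```python
def ceil_log2(s):
    """Exact ceil(log2(s)) for an integer s >= 1."""
    return (s - 1).bit_length()

def load(child_loads, level_gap, n):
    """load(u) = ceil(sum of child loads / 2^Delta_u), Delta_u = min(ceil(log n), gap)."""
    delta = min(ceil_log2(n), level_gap)
    return -(-sum(child_loads) // 2 ** delta)    # integer ceiling division

def root_cost(child_level, child_loads):
    """a(Y) = level of the root's children + ceil(log(sum of their loads))."""
    return child_level + ceil_log2(sum(child_loads))

# Hand-built example for Y = 1, 1, 2 (n = 3).
n = 3
u = load([1, 1], level_gap=1, n=n)   # two leaves at level 1 under a node at level 2
print(root_cost(2, [u, 1]))          # root's children: u and the third leaf, both at level 2
```

The result agrees with the direct computation for 1, 1, 2: pairing the first two leaves gives a node of weight 2, and joining it with the third leaf gives a root of weight 3.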

It is straightforward to build the level tree for $Y$ in $O(n)$ time, by first building an alphabetic minimax
tree for it. Moreover, if we set a bit $x_i$ to 1 and thus decrement $y_i$, then the shape of the level tree for $Y$
and the loads change only in the vicinity of the $i$th leaf and along the path from it to the root. The number
of levels is the number of distinct weights in $Y$ plus one, so the length of that path is $O(d)$ (recall $d$ is
the number of distinct integers $\lceil w_i \rceil$). Unfortunately, the level tree can have very high degree, so we may
not be able, e.g., to navigate very quickly from the root to a leaf.

We store a pointer to the root of the level tree and an array of pointers to its leaves, and pointers from
each node to its parent. At each internal node, we store its children in a doubly-linked list (so each child
points to the siblings immediately to its left and right). It is not hard to verify that, with these pointers,
we can implement a cost operation in $O(1)$ time and reach all the nodes that need to be updated for a set
operation in $O(d)$ time. We cannot implement set operations in $O(d)$ worst-case time, however, because
of the following case (see Figure 4): suppose the siblings $u_1$ and $u_2$ immediately to the left and right of
the $i$th leaf $v$ are internal nodes whose children belong to level intervals with level $y_i - 1$; if we set $x_i$
to 1 and thus decrement $y_i$ and $v$'s level, then $u_1$'s former children, $v$ and $u_2$'s former children will all
have the same parent (either a new node $u$ if $v$ had siblings other than $u_1$ and $u_2$, as shown in Figure 4,
or their former parent if it did not).

Figure 3. The level tree for 4, 5, 2, 2, 2, 1, 2, 3, 6, 4, with internal nodes' loads shown.

Figure 4. Decrementing a node $v$'s level can force us to combine its adjacent siblings $u_1$ and $u_2$ into a new node
$u$.

To deal with this case, we store all the internal nodes of the level tree in a union-find data structure,
due to Mannila and Ukkonen [12], that supports a deunion operation. Rather than adjusting all of $u_1$'s
and $u_2$'s former children to point to their new parent, we simply perform a union operation on $u_1$ and $u_2$.
Whenever we follow a pointer to an internal node, we perform a find operation on it and, if necessary,
update the pointer. Each cost operation on the level tree takes one find operation on the union-find
data structure and $O(1)$ extra time, and each set operation takes at most one union operation, $O(d)$
find operations and $O(d)$ extra time. Whenever we make a modification to the level tree other than an
operation on the union-find data structure, we push it onto a stack. To perform an undo operation on
the level tree, we pop and reverse all the modifications we made since starting the last set operation and,
if necessary, perform a deunion operation. Any sequence of $O(n)$ operations on the level tree takes
$O(nd)$ operations on the union-find data structure, which Mannila and Ukkonen showed take a total of
$O(nd \log\log n)$ time.
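
To illustrate what a union-find structure with deunion offers, here is a minimal sketch of a simpler, swapped-in technique: union by rank without path compression, plus a history stack so that the most recent union can be reversed. It is not Mannila and Ukkonen's structure and does not achieve their time bound (each find here takes $O(\log n)$ time); the class and method names are hypothetical.

```python
class RollbackUnionFind:
    """Union by rank with a history stack; deunion reverses the latest union."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.history = []                 # one entry per union call

    def find(self, v):
        # No path compression, so a union can be reversed by resetting one pointer.
        while self.parent[v] != v:
            v = self.parent[v]
        return v

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            self.history.append(None)     # nothing changed, but keep the stacks aligned
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra              # attach the lower-rank root under the other
        bumped = self.rank[ra] == self.rank[rb]
        if bumped:
            self.rank[ra] += 1
        self.history.append((rb, ra, bumped))
        return True

    def deunion(self):
        entry = self.history.pop()
        if entry is not None:
            child, root, bumped = entry
            self.parent[child] = child
            if bumped:
                self.rank[root] -= 1

uf = RollbackUnionFind(4)
uf.union(0, 1)
uf.union(1, 2)
uf.deunion()
print(uf.find(0) == uf.find(1), uf.find(0) == uf.find(2))
```

Skipping path compression keeps every union a single pointer change, which is what makes reversing it cheap in this simplified variant.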

Lemma 4.1. In $O(n)$ time we can build a data structure that performs any sequence of $O(n)$ set, undo
and cost operations in $O(nd \log\log n)$ time.

5. Conclusion 

Combining Lemmas 3.1 and 4.1, we have the following theorem:

Theorem 5.1. We can build an alphabetic minimax tree for $W$ in $O(nd \log\log n)$ time.

Since $d$ could be as small as 1 or as large as $n$, our theorem is incomparable to previous results. We
can build the tree in $O(n \min(d \log\log n, \log n))$ time, of course, by first finding $d$ in $O(n)$ time and
then, depending on whether $d \log\log n < \log n$, using either our algorithm or one of the $O(n \log n)$-time
algorithms mentioned in Section 1.
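
The selection rule can be sketched in a few lines. Counting the distinct integers $\lceil w_i \rceil$ with a hash set (expected linear time) is an assumption about how $d$ is found, and the function name and returned labels are hypothetical.

```python
from math import ceil, log2

def choose_algorithm(weights):
    """Return (d, label) for whichever bound is smaller: d log log n or log n."""
    n = len(weights)
    d = len({ceil(w) for w in weights})   # number of distinct integers ceil(w_i)
    if n < 2:
        return d, "either"                # log log n is undefined for n < 2
    if d * log2(log2(n)) < log2(n):
        return d, "O(nd log log n)"
    return d, "O(n log n)"

print(choose_algorithm([0.5] * 16))
print(choose_algorithm(list(range(16))))
```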

In closing, we note there has recently been interesting work involving unordered minimax trees.
Baer [1] observed that the problem of building a prefix code with minimum maximum pointwise
redundancy, originally posed and solved by Drmota and Szpankowski [4], can also be solved with
a Huffman-like algorithm, due to Golumbic [6], for building unordered minimax trees. Given a
probability distribution over $n$ characters, Drmota and Szpankowski's algorithm takes $O(n \log n)$ time, or
$O(n)$ time if the probabilities are sorted by the fractional parts of their logarithms; we conjecture that, by
using Blum et al.'s algorithm as we did in this paper, it can be made to run in $O(n)$ time even when the
probabilities are unsorted. Like Huffman's algorithm (see [11]), Golumbic's algorithm takes $O(n \log n)$
time, or $O(n)$ time if the probabilities are sorted by their values.